
21 research outputs found

Enhanced Inversion of Schema Evolution with Provenance

Authors: Auge Tanja; Heuer Andreas
Publication date: 24/11/2022

Long-term data-driven studies have become indispensable in many areas of science. Often, the formats, structures and semantics of data change over time: the data sets evolve. Therefore, studies spanning several decades in particular have to consider changing database schemas. The evolution of these databases leads at some point to a large number of schemas, which are costly and time-consuming to store and manage. However, for research data to be reproducible, each database version must be reconstructable with little effort, so that a previously published result can be validated and reproduced at any time. Nevertheless, in many cases, such an evolution cannot be fully reconstructed. This article classifies the 15 most frequently used schema modification operators and defines the associated inverse for each operation. To avoid information loss, it furthermore defines which additional provenance information has to be stored. We define four classes dealing with dangling tuples, duplicates and provenance-invariant operators. Each class is presented by one representative. By using and extending the theory of schema mappings and their inverses for queries, data analysis, why-provenance, and schema evolution, we are able to combine data analysis applications with provenance under evolving database structures, in order to enable the reproducibility of scientific results over longer periods of time. While most of the inverses of schema mappings used for analysis or evolution are not exact, but only quasi-inverses, adding provenance information enables us to reconstruct a sub-database of research data that is sufficient to guarantee reproducibility.

arXiv.org e-Print Archive

Implementing Provenance Queries in Big Data Analytics Environments [Umsetzung von Provenance-Anfragen in Big-Data-Analytics-Umgebungen]

Author: Auge Tanja
Publication date: 28/09/2017

The aim of this work is to adapt the techniques of why-, where- and how-provenance queries to environments that go beyond simple queries such as selection, projection and join and also use OLAP operations and further machine learning algorithms. The exclusively extensional provenance answers are given by provenance polynomials and (minimal) witness bases. Extending the CHASE algorithm for databases with a BACKCHASE phase for evaluating provenance answers makes it possible to determine the CHASE inverse type (exact/relaxed/result-equivalent) of a given query.

Rostocker Dokumentenserver

Universität Rostock, Lehrstuhl Datenbank- und Informationssysteme: Dbis Repository

Lecture Series: Research @ DBIS [Ringvorlesung: Forschung @ DBIS]

Author: Auge Tanja
Publication venue: Universität Rostock
Publication date: 01/01/2016

Universität Rostock, Lehrstuhl Datenbank- und Informationssysteme: Dbis Repository

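Exposé of a Doctoral Project: Provenance Management for Data Science Applications Taking Data and Schema Evolution into Account [Exposé eines Promotionsprojektes: Provenance Management für Data-Science-Anwendungen unter Berücksichtigung von Daten- und Schema-Evolution]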

Author: Auge Tanja
Publication venue: University of Rostock
Publication date: 01/01/2018

Universität Rostock, Lehrstuhl Datenbank- und Informationssysteme: Dbis Repository

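Inverses in Research Data Management: A Combination of Provenance Management, Schema Evolution and Data Evolution [Inverse im Forschungsdatenmanagement: Eine Kombination aus Provenance Management, Schema- und Daten-Evolution]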

Authors: Auge Tanja; Heuer Andreas
Publication date: 01/05/2018

Universität Rostock, Lehrstuhl Datenbank- und Informationssysteme: Dbis Repository

The Theory behind Minimizing Research Data -- Result equivalent CHASE-inverse Mappings

Authors: Auge Tanja; Heuer Andreas
Publication date: 01/08/2018

In research data management and other applications, the primary research data have to be archived for a longer period of time to guarantee the reproducibility of research results. How can we minimize the amount of data to be archived, especially in the case of constantly changing databases or database schemes and permanently performing new evaluations on these data? In this article, we will consider evaluation queries given in an extended relational algebra. For each of the operations, we will decide whether we can compute an inverse mapping to automatically compute a (minimal) subdatabase of the original research database when only the evaluation query and the evaluation result is stored. We will distinguish between different types of inverses from exact inverses to data exchange equivalent inverses. If there is no inverse mapping, especially for aggregation operations, we will derive the necessary provenance information to be able to perform the calculation of this subdatabase. The theory behind this minimization of research data, that has to be archived to guarantee reproducible research, is based on the CHASE&BACKCHASE technique, the theory of schema mappings and their inverses, and the provenance polynomials to be used for how provenance.

Universität Rostock, Lehrstuhl Datenbank- und Informationssysteme: Dbis Repository

Provenance Management Using Schema Mappings with Annotations [Provenance Management unter Verwendung von Schemaabbildungen mit Annotationen]

Author: Auge Tanja (gnd: 1176435728)
Publication venue: Universität Rostock

The goal of the PhD project ProSA (Provenance Management using Schema Mappings with Annotations) is to apply and generalize provenance management techniques in the field of research data management using Chase&Backchase enhanced with additional provenance information.

Rostocker Dokumentenserver

Latest Developments in Computer Science: Final Presentation of the Provenance Cluster [Neueste Entwicklungen der Informatik: Abschlusspräsentation Cluster Provenance]

Authors: Auge Tanja; Brossmann Sabrina; Wilsdorf Pia
Publication date: 22/09/2016

Universität Rostock, Lehrstuhl Datenbank- und Informationssysteme: Dbis Repository

Privacy Aspects of Provenance Queries

Authors: Auge Tanja; Heuer Andreas; Scharlau Nic
Publication date: 01/06/2020

Given a query result of a big database, why-provenance can be used to calculate the necessary part of this database, consisting of so-called witnesses. If this database consists of personal data, privacy protection has to prevent the publication of these witnesses. This implies a natural conflict of interest between publishing original data (provenance) and protecting these data (privacy). In this paper, privacy goes beyond the concept of personal data protection. The paper gives an extended definition of privacy as intellectual property protection. If the provenance information is not sufficient to reconstruct a query result, additional data such as witnesses or provenance polynomials have to be published to guarantee traceability. Nevertheless, publishing this provenance information might be a problem if (significantly) more tuples than necessary can be derived from the original database. At this point, it is already possible to violate privacy policies, provided that quasi identifiers are included in this provenance information. With this poster, we point out fundamental problems and discuss first proposals for solutions

arXiv.org e-Print Archive

Universität Rostock, Lehrstuhl Datenbank- und Informationssysteme: Dbis Repository